Predicting Protein-Protein Interaction Sites From Amino Acid Sequence

Running Title: Protein-Protein Interaction Sites

Authors

  • Changhui Yan
  • Vasant Honavar
  • Drena Dobbs
Abstract

We describe an approach for computational prediction of protein-protein interaction sites using a support vector machine (SVM) classifier. Interface residues and other surface residues were extracted from 115 proteins derived from a set of 70 heterocomplexes in the PDB. The SVM classifier was trained to predict whether a surface residue is located in the interface, based on the identity of the target residue and its 10 sequence neighbors. The effectiveness of the approach was evaluated using 115 leave-one-out cross-validation (jack-knife) experiments. In each experiment, an SVM classifier was trained using a set of 1250 randomly chosen interface residues and an equal number of non-interface residues from 114 of the 115 molecules. The resulting classifier was used to classify surface residues from the remaining molecule into interface and non-interface residues. The classifier in each experiment was evaluated in terms of several performance measures. In results averaged over the 115 experiments, interface residues and non-interface residues were identified with relatively high specificity (71%) and sensitivity (67%), and with a correlation coefficient of 0.29 between predicted and actual class labels, indicating that the method performs substantially better than chance (zero correlation). We also investigated the classifier's performance in terms of overall interaction site recognition. In 80% of the proteins, the classifier recognized the interaction surface by identifying at least half of the interface residues, and in 98% of the proteins, at least 20% of the interface residues were correctly identified. The success of this approach was confirmed by examination of predicted interfaces in the context of the three-dimensional structures of representative complexes. This study demonstrates that an SVM classifier can be used to predict whether a surface residue is an interface residue using amino acid sequence information.
Because surface residues can be identified based on their solvent accessible surface area (ASA), given recent progress in computational methods for predicting ASA from sequence, the approach described in this paper provides a basis for computational prediction of interaction sites in proteins for which only amino acid sequence information is available.
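
The input representation and the correlation measure mentioned in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: the 10 sequence neighbors are taken as 5 residues on each side of the target, each window position is one-hot encoded over the 20 standard amino acids, positions past the ends of the chain are zero-padded, and the correlation coefficient is computed as the Matthews (phi) coefficient for binary labels. The names `encode_window` and `correlation_coefficient` are hypothetical, and the SVM training step itself is left out.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HALF_WINDOW = 5  # assumption: 5 neighbors on each side of the target


def encode_window(sequence, index, half_window=HALF_WINDOW):
    """One-hot encode the residue at `index` and its sequence neighbors.

    Positions beyond the chain ends, and non-standard residue letters,
    are encoded as all zeros (an assumption made for this sketch).
    """
    features = []
    for pos in range(index - half_window, index + half_window + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            aa_index = AMINO_ACIDS.find(sequence[pos])
            if aa_index >= 0:
                one_hot[aa_index] = 1
        features.extend(one_hot)
    return features


def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for labels where 1 = interface, 0 = non-interface."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn


def correlation_coefficient(actual, predicted):
    """Matthews correlation between actual and predicted binary labels.

    Returns 0.0 when the denominator is zero, for example when every
    residue is assigned to the same class.
    """
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

Under these assumptions each surface residue becomes a 220-dimensional vector (11 positions times 20 amino acids) that could be passed to an SVM implementation of choice. In a leave-one-out setup like the one described, vectors from 114 molecules would form the training set and `correlation_coefficient` would be applied to the predictions for the held-out molecule.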





Publication date: 2002